Privacy Preserving Clustering on Distorted data
Abstract
Privacy preservation has become a major concern in the design of security- and privacy-related data mining applications. Protecting sensitive or confidential information in data mining is an important long-term goal, because releasing data increases the risk of disclosure. Data distortion techniques are widely used to protect sensitive data; these approaches protect data by adding noise or by applying matrix decomposition methods. In this paper we focus primarily on two data distortion methods: singular value decomposition (SVD) and sparsified singular value decomposition (SSVD). Several privacy metrics have been proposed to measure the difference between the original and distorted datasets and the degree of privacy protection. We apply k-means clustering to the distorted datasets as the data mining utility, and our experiments use a real-world dataset. Both SVD and SSVD yield an efficient solution that meets the privacy requirements, and the clustering accuracy on the distorted data is almost equal to that on the original dataset.
Keywords: Privacy Preserving, Data Distortion, Singular Value Decomposition (SVD), Sparsified Singular Value Decomposition (SSVD), k-means clustering.
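The pipeline named in the abstract (distort the data with SVD or SSVD, measure how far it moved, then cluster it with k-means) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the synthetic data, the rank `rank`, the sparsification threshold `eps`, the relative Frobenius-norm `value_difference` metric, and the helper names are all assumptions made for the example, and the small `kmeans` function is a plain Lloyd iteration with fixed starting centroids.

```python
import numpy as np


def svd_distort(A, rank):
    """Distort A by keeping only its `rank` largest singular triplets."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]


def ssvd_distort(A, rank, eps):
    """Truncated SVD whose factor entries below `eps` in magnitude are zeroed.

    The thresholding rule is an assumption about how the sparsification
    step is usually described; the abstract does not spell it out.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Uk, Vtk = U[:, :rank].copy(), Vt[:rank, :].copy()
    Uk[np.abs(Uk) < eps] = 0.0
    Vtk[np.abs(Vtk) < eps] = 0.0
    return Uk @ np.diag(s[:rank]) @ Vtk


def value_difference(A, A_dist):
    """Relative Frobenius-norm gap between original and distorted data."""
    return np.linalg.norm(A - A_dist) / np.linalg.norm(A)


def kmeans(X, init_centroids, n_iter=20):
    """Plain Lloyd iteration; returns one cluster label per row of X."""
    centroids = np.array(init_centroids, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(len(centroids)):
            if np.any(labels == j):  # leave a centroid in place if its cluster is empty
                centroids[j] = X[labels == j].mean(axis=0)
    return labels


# Hypothetical data: two well-separated groups of 30 records with 4 attributes.
rng = np.random.default_rng(0)
A = np.vstack([rng.normal(0.0, 0.5, (30, 4)), rng.normal(5.0, 0.5, (30, 4))])

A_svd = svd_distort(A, rank=2)
A_ssvd = ssvd_distort(A, rank=2, eps=0.05)

vd_svd = value_difference(A, A_svd)
vd_ssvd = value_difference(A, A_ssvd)

# Cluster each version, starting from one record of each group.
labels_orig = kmeans(A, A[[0, 30]])
labels_svd = kmeans(A_svd, A_svd[[0, 30]])
labels_ssvd = kmeans(A_ssvd, A_ssvd[[0, 30]])

print("value difference (SVD): ", vd_svd)
print("value difference (SSVD):", vd_ssvd)
print("labels match (SVD): ", np.array_equal(labels_orig, labels_svd))
print("labels match (SSVD):", np.array_equal(labels_orig, labels_ssvd))
```

Because the truncated SVD is the best rank-`rank` approximation in the Frobenius norm, the SSVD output can only be at least as far from the original as the SVD output; that extra distortion is the privacy gain the sparsification step trades against utility.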
Similar resources
SVD based Data Transformation Methods for Privacy Preserving Clustering
Nowadays privacy issues are a major concern for many government and private organizations that mine important information from large data repositories. Privacy preserving clustering is one of the techniques that has emerged to address the problem of extracting useful clustering patterns from distorted data without accessing the original data directly. In this paper two hybrid data transfor...
A Fuzzy Based Approach for Privacy Preserving Clustering
Extracting previously unknown patterns from huge volumes of data is the primary objective of any data mining algorithm. Data collection has recently grown tremendously owing to advances in information technology. The patterns revealed by data mining algorithms can be used in various domains such as image analysis, marketing, and weather forecasting. As a side effect o...
Repeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One well-researched constrained clustering algorithm is microaggregation. In a microaggreg...
Data Security Using Decomposition
Protecting privacy from unauthorized access is one of the primary concerns in data use, from national security to business transactions. This concern has given rise to a new branch of data mining known as Privacy Preserving Data Mining (PPDM). Privacy preservation is a major concern when data mining techniques are applied to datasets containing personal, sensitive, or confidential information. Data disto...
Privacy Preserving Clustering
The freedom and transparency of information flow on the Internet has heightened privacy concerns. Given a set of data items, clustering algorithms group similar items together. Clustering has many applications, such as customer-behavior analysis, targeted marketing, forensics, and bioinformatics. In this paper, we present the design and analysis of a privacy-preserving k-means clustering algo...
Publication date: 2012